[#441] [IMPROVE] Preload embedding model at startup by sahilds1 · Pull Request #461 · CodeForPhilly/balancer-main

sahilds1 · 2026-02-13T20:51:46Z

Title: Preload embedding model at startup for search and file upload

Description

Preload SentenceTransformer model at Django startup before traffic is routed to the application instance
Add tests for the embeddings services by pulling apart the core logic to make testing easier

Related Issue

GitHub Issue #441

Manual Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose up --build                                                                          [11:21:14]

backend-1  | INFO 2026-03-27 15:21:49,284 _client 125 281472847924544 HTTP Request: GET https://huggingface.co/api/models/sentence-transformers/paraphrase-MiniLM-L6-v2/xet-read-token/c9a2bfebc254878aee8c3aca9e6844d5bbb102d1 "HTTP/1.1 200 OK"
Loading weights: 100%|██████████| 103/103 [00:00<00:00, 6017.65it/s]

Automated Tests

sahildshah•~/github/balancer-main(441-embedding-models⚡)» docker compose exec backend pytest api/services/test_embedding_services.py -v                       [11:30:57]
========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.4, pytest-9.0.2, pluggy-1.6.0 -- /usr/local/bin/python
cachedir: .pytest_cache
django: version: 4.2.3, settings: balancer_backend.settings (from ini)
rootdir: /usr/src/server
configfile: pytest.ini
plugins: django-4.12.0, anyio-4.12.1
collected 19 items

api/services/test_embedding_services.py::test_build_query_authenticated_uses_or_filter PASSED                                                                      [  5%]
api/services/test_embedding_services.py::test_build_query_unauthenticated_uses_superuser_only_filter PASSED                                                        [ 10%]
api/services/test_embedding_services.py::test_build_query_annotates_and_orders_by_distance PASSED                                                                  [ 15%]
api/services/test_embedding_services.py::test_build_query_no_document_filter_when_both_none PASSED                                                                 [ 21%]
api/services/test_embedding_services.py::test_build_query_guid_takes_precedence_over_document_name PASSED                                                          [ 26%]
api/services/test_embedding_services.py::test_build_query_guid_filter_applied PASSED                                                                               [ 31%]
api/services/test_embedding_services.py::test_build_query_document_name_filter_applied PASSED                                                                      [ 36%]
api/services/test_embedding_services.py::test_build_query_empty_string_guid_falls_back_to_document_name PASSED                                                     [ 42%]
api/services/test_embedding_services.py::test_build_query_respects_num_results PASSED                                                                              [ 47%]
api/services/test_embedding_services.py::test_build_query_returns_unevaluated_queryset PASSED                                                                      [ 52%]
api/services/test_embedding_services.py::test_evaluate_query_empty_queryset PASSED                                                                                 [ 57%]
api/services/test_embedding_services.py::test_evaluate_query_maps_fields PASSED                                                                                    [ 63%]
api/services/test_embedding_services.py::test_evaluate_query_none_upload_file PASSED                                                                               [ 68%]
api/services/test_embedding_services.py::test_log_usage_empty_results PASSED                                                                                       [ 73%]
api/services/test_embedding_services.py::test_log_usage_unauthenticated_user_stored_as_none PASSED                                                                 [ 78%]
api/services/test_embedding_services.py::test_log_usage_none_user_stored_as_none PASSED                                                                            [ 84%]
api/services/test_embedding_services.py::test_log_usage_computes_distance_stats PASSED                                                                             [ 89%]
api/services/test_embedding_services.py::test_log_usage_swallows_exceptions PASSED                                                                                 [ 94%]
api/services/test_embedding_services.py::test_get_closest_embeddings_wiring PASSED                                                                                 [100%]

=========================================================================== 19 passed in 1.62s ===========================================================================

Documentation

Updated README with instructions for running backend tests

Reviewers

@taichan03 @amahuli03

Notes


Copilot

Pull request overview

This PR aims to ensure the SentenceTransformer embedding model is loaded during Django startup (before traffic hits the instance) and to make the embeddings search logic more testable by factoring it into smaller functions.

Changes:

Refactors get_closest_embeddings by extracting query building, evaluation, and usage logging into helper functions.
Adds pytest + pytest-django support (requirements + pytest.ini) and new unit tests for embedding service helpers.
Updates GitHub Actions workflow and README to run backend tests.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `server/requirements.txt` | Adds pytest dependencies to support running backend tests. |
| `server/pytest.ini` | Configures pytest-django settings/module and python path for the server package. |
| `server/api/services/test_embedding_services.py` | Adds unit tests for query evaluation and usage logging helpers. |
| `server/api/services/embedding_services.py` | Refactors embeddings search into `build_query`, `evaluate_query`, `log_usage`, and reworks `get_closest_embeddings`. |
| `server/api/apps.py` | Attempts to preload the embedding model during Django app initialization via `ready()`. |
| `README.md` | Documents how to run backend tests inside the backend container. |
| `.github/workflows/python-app.yml` | Changes CI branch targets and adds dependency install + pytest execution. |


Copilot · 2026-03-11T17:56:02Z

server/api/services/embedding_services.py

@@ -32,59 +31,52 @@ def get_closest_embeddings(

    Returns
    -------
-    list[dict]
-        List of dictionaries containing embedding results with keys:
-        - name: document name
-        - text: embedded text content
-        - page_number: page number in source document
-        - chunk_number: chunk number within the document
-        - distance: L2 distance from query embedding
-        - file_id: GUID of the source file
+    QuerySet
+        Unevaluated Django QuerySet ordered by L2 distance, sliced to num_results
    """
-
-    encoding_start = time.time()
-    transformerModel = TransformerModel.get_instance().model
-    embedding_message = transformerModel.encode(message_data)
-    encoding_time = time.time() - encoding_start
-
-    db_query_start = time.time()
-
    # Django QuerySets are lazily evaluated
    if user.is_authenticated:
        # User sees their own files + files uploaded by superusers
-        closest_embeddings_query = (
-            Embeddings.objects.filter(
-                Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
-            )
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
+        queryset = Embeddings.objects.filter(
+            Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True)
        )
    else:
        # Unauthenticated users only see superuser-uploaded files
-        closest_embeddings_query = (
-            Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
-            .annotate(
-                distance=L2Distance("embedding_sentence_transformers", embedding_message)
-            )
-            .order_by("distance")
-        )
+        queryset = Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True)
+
+    queryset = (
+        queryset
+        .annotate(distance=L2Distance("embedding_sentence_transformers", embedding_vector))
+        .order_by("distance")
+    )

    # Filtering to a document GUID takes precedence over a document name
    if guid:
-        closest_embeddings_query = closest_embeddings_query.filter(
-            upload_file__guid=guid
-        )
+        queryset = queryset.filter(upload_file__guid=guid)
    elif document_name:
-        closest_embeddings_query = closest_embeddings_query.filter(name=document_name)
+        queryset = queryset.filter(name=document_name)

    # Slicing is equivalent to SQL's LIMIT clause
-    closest_embeddings_query = closest_embeddings_query[:num_results]
+    return queryset[:num_results]


build_query() introduces/relocates important filtering + precedence logic (authenticated vs unauthenticated visibility; guid-over-document_name; LIMIT slicing), but the new tests only cover evaluate_query and log_usage. Add unit/integration tests covering build_query behavior (e.g., guid precedence and the authenticated/unauthenticated queryset filters) to prevent regressions in access control and filtering.

Building on Copilot's comment, the specifics of the QuerySet object's structure aren't publicly documented. To inspect the QuerySets, we should actually execute them.

There are a couple of ways we could handle DB access for these tests. We could use [pytest-django's `@pytest.mark.django_db`](https://pytest-django.readthedocs.io/en/latest/database.html), which wraps the test in a transaction that rolls back automatically afterwards. Django also has a built-in `django.test.TestCase`, which does a similar thing.

Thanks for sharing the docs references -- I added tests for build_query and didn't have to access the database because I was able to inspect which methods and arguments were called on the model ("Embeddings")
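The approach described in the reply above (asserting on the methods and arguments called on the model instead of executing SQL) can be sketched without Django installed. This is a minimal illustration, not the PR's test code: `build_query_sketch` and `make_manager` are hypothetical stand-ins, the manager is passed in as a parameter, and plain keyword filters replace the `Q` objects and `L2Distance` annotation that the real `build_query` uses on `Embeddings.objects`.

```python
from unittest.mock import MagicMock


def build_query_sketch(manager, guid=None, document_name=None, num_results=10):
    """Hypothetical stand-in for build_query with the same guid-over-name precedence."""
    queryset = manager.filter(upload_file__uploaded_by__is_superuser=True)
    queryset = queryset.order_by("distance")
    if guid:
        queryset = queryset.filter(upload_file__guid=guid)
    elif document_name:
        queryset = queryset.filter(name=document_name)
    return queryset[:num_results]


def make_manager():
    """Mock whose chained calls return itself, so every call is recorded in one place."""
    manager = MagicMock()
    manager.filter.return_value = manager
    manager.order_by.return_value = manager
    return manager


manager = make_manager()
build_query_sketch(manager, guid="abc", document_name="ignored.pdf", num_results=5)

# The guid filter is applied; the document-name filter is not.
manager.filter.assert_any_call(upload_file__guid="abc")
filter_kwargs = [call.kwargs for call in manager.filter.call_args_list]
assert {"name": "ignored.pdf"} not in filter_kwargs

# Slicing is recorded as a __getitem__ call with slice(None, 5).
manager.__getitem__.assert_called_once_with(slice(None, 5, None))
```

The trade-off is that mock-based tests like this check which calls were made, not what rows come back, so they would not catch a filter that is spelled correctly but wrong against the real schema.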

Copilot · 2026-03-11T17:56:02Z

server/api/apps.py

+    def ready(self):
+        from .services.sentencetTransformer_model import TransformerModel
+        TransformerModel.get_instance()


ApiConfig.ready() will only run if this AppConfig is actually used by Django. Right now INSTALLED_APPS appears to include just "api" (not "api.apps.ApiConfig"), and api/__init__.py doesn’t set a default config, so this preload hook may never execute. Consider updating INSTALLED_APPS to reference api.apps.ApiConfig (or otherwise ensuring this config is selected) so the model is preloaded as intended.

The model preloads as intended: since Django 3.2, an app whose apps.py defines a single AppConfig subclass has that config selected automatically, so listing "api" in INSTALLED_APPS is enough.

Copilot · 2026-03-11T17:56:02Z

server/api/apps.py

+    def ready(self):
+        from .services.sentencetTransformer_model import TransformerModel
+        TransformerModel.get_instance()


Calling TransformerModel.get_instance() unconditionally in ready() will run for every Django startup context (tests, migrations, management commands, autoreload) and can trigger a large model download/init even when no web traffic will be served. Consider gating this preload behind an explicit env flag (or limiting it to the web server entrypoint) to avoid slowing/fragilizing CI and one-off management commands.

I added a guard to only preload the model when we're actually going to serve requests
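The guard itself is not shown in this conversation. One way to express "only preload when we are going to serve requests" under Django's development server is sketched below. `should_preload` is a hypothetical helper, not the PR's code, and the `RUN_MAIN` check assumes the autoreloader is in use (Django sets `RUN_MAIN=true` in the reloader's child process, which is the one that serves requests).

```python
import os
import sys


def should_preload(argv=None, environ=None):
    """Return True only for the runserver process that actually serves requests.

    Hypothetical helper. With the autoreloader enabled, runserver starts a
    parent watcher process and a child worker; RUN_MAIN is "true" in the child.
    """
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    if "runserver" not in argv:
        return False  # migrate, test, shell, pytest, ...
    if "--noreload" in argv:
        return True  # single process, no watcher parent
    return environ.get("RUN_MAIN") == "true"


print(should_preload(["manage.py", "migrate"], {}))                      # False
print(should_preload(["manage.py", "runserver"], {}))                    # False
print(should_preload(["manage.py", "runserver"], {"RUN_MAIN": "true"}))  # True
```

A production WSGI or ASGI server does not put `runserver` in `sys.argv`, so a deployment using one would need a different signal, such as the explicit environment flag Copilot suggests.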

.github/workflows/python-app.yml

amahuli03

This is good, the refactor of get_closest_embeddings() was definitely needed. For build_query we need some tests. It's a little more complicated than simple unit testing but I've left some docs references in my comments.

I'm not sure how CI will work right now, with preloading the model in ready. Also, once we have test coverage for build_query that may change CI requirements.


Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.


Copilot · 2026-03-27T17:11:57Z

server/api/services/test_embedding_services.py

+# ---------------------------------------------------------------------------
+
+# Only forwarded to L2Distance
+EMBEDDING_VECTOR = [0.1, 0.2, 0.3]  


Trailing whitespace after the EMBEDDING_VECTOR literal is likely unintentional and can cause noisy diffs/lint failures. Please remove the extra spaces.

Suggested change

-EMBEDDING_VECTOR = [0.1, 0.2, 0.3]  
+EMBEDDING_VECTOR = [0.1, 0.2, 0.3]

Copilot · 2026-03-27T17:11:57Z

server/api/services/embedding_services.py

 import logging
 from statistics import median

+# Django filter() only does ADD logic


The comment says “Django filter() only does ADD logic”, which looks like a typo and is also misleading: queryset filters are combined with AND semantics by default, while OR is possible via Q objects. Consider correcting/removing this comment to avoid confusing future readers.

Suggested change

-# Django filter() only does ADD logic
+# Use Q objects to express OR conditions in Django queries

Copilot · 2026-03-27T17:11:58Z

server/api/services/embedding_services.py

    except Exception as e:
        logger.error(f"Failed to create semantic search usage database record: {e}")


log_usage() swallows all exceptions, but logger.error(f"... {e}") drops the traceback, making it much harder to debug production failures when SemanticSearchUsage writes fail. Prefer logger.exception(...) (or logger.error(..., exc_info=True)) so the stack trace is captured while still not interrupting the request.

Suggested change

-    except Exception as e:
-        logger.error(f"Failed to create semantic search usage database record: {e}")
+    except Exception:
+        logger.exception("Failed to create semantic search usage database record")
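The difference Copilot describes can be reproduced with the standard library alone. The sketch below logs the same caught exception both ways and checks whether a traceback reaches the handler; `capture_log` and the logger names are illustrative and not part of the PR.

```python
import io
import logging


def capture_log(use_exception):
    """Log a caught ZeroDivisionError and return the formatted log text."""
    stream = io.StringIO()
    logger = logging.getLogger(f"sketch.{use_exception}")
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        1 / 0
    except Exception as e:
        if use_exception:
            # Same level as error(), but also attaches the active traceback.
            logger.exception("Failed to create usage record")
        else:
            logger.error(f"Failed to create usage record: {e}")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


print("Traceback" in capture_log(use_exception=False))  # False
print("Traceback" in capture_log(use_exception=True))   # True
```

In both cases the exception is still swallowed, so the request is not interrupted; only the amount of diagnostic detail in the log changes.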

Copilot · 2026-03-27T17:11:58Z

server/requirements.txt

+pytest
+pytest-django


pytest and pytest-django are added without version pins. Since these packages can introduce breaking behavior across major/minor releases, consider pinning them (or constraining with compatible ranges) to keep test runs reproducible across environments and container rebuilds.

Suggested change

-pytest
-pytest-django
+pytest>=8.0.0,<9.0.0
+pytest-django>=4.8.0,<5.0.0

Copilot · 2026-03-27T17:11:58Z

server/api/apps.py

+        # Note: paraphrase-MiniLM-L6-v2 (~80MB) is downloaded from HuggingFace on first
+        # use and cached to ~/.cache/torch/sentence_transformers/ inside the container.
+        # That cache is ephemeral — every container rebuild re-downloads the model unless
+        # a volume is mounted at that path.
+        from .services.sentencetTransformer_model import TransformerModel
+        TransformerModel.get_instance()


Preloading the SentenceTransformer model during AppConfig.ready() can fail due to transient network/cache issues (e.g., HuggingFace downtime), which would now prevent Django from starting and serving any traffic. Consider wrapping TransformerModel.get_instance() in a try/except with logger.exception(...), and optionally allowing startup to continue (falling back to lazy load on first request) or making “fail fast” behavior configurable via an env var.
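A minimal sketch of the suggestion above, assuming a zero-argument loader and a configurable fail-fast switch. `preload_model`, `flaky_load`, and the `EMBEDDING_PRELOAD_FAIL_FAST` environment variable are hypothetical names chosen for illustration; in the PR the loader would presumably be `TransformerModel.get_instance`.

```python
import logging
import os

logger = logging.getLogger(__name__)


def preload_model(load, fail_fast=None):
    """Try to load the model at startup.

    `load` is any zero-argument callable. `fail_fast` defaults to the
    hypothetical EMBEDDING_PRELOAD_FAIL_FAST env var ("1" enables it).
    Returns the loaded object, or None if loading failed and startup
    should continue with lazy loading on first request.
    """
    if fail_fast is None:
        fail_fast = os.environ.get("EMBEDDING_PRELOAD_FAIL_FAST", "") == "1"
    try:
        return load()
    except Exception:
        logger.exception("Embedding model preload failed")
        if fail_fast:
            raise  # refuse to start without the model
        return None


def flaky_load():
    """Simulates a transient network failure during model download."""
    raise ConnectionError("HuggingFace unreachable")


print(preload_model(flaky_load, fail_fast=False))  # None
```

Whether continuing without the model is acceptable depends on the deployment: falling back to lazy loading keeps the instance up but reintroduces the first-request latency this PR is trying to remove.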

REFACTOR Pull apart get_closest_embeddings to make testing easier

59b40f0

sahilds1 changed the title ~~REFACTOR Pull apart get_closest_embeddings to make testing easier~~ [DRAFT] [#441] Embedding Models Feb 13, 2026

sahilds1 self-assigned this Feb 13, 2026

sahilds1 added 2 commits February 13, 2026 16:12

ADD Add infra required to run pytest

3ffb74a

ADD Start adding tests for embedding_services"

12b09a7

sahilds1 changed the title ~~[DRAFT] [#441] Embedding Models~~ [WIP] [#441] Embedding Models Feb 13, 2026

sahilds1 added 2 commits February 17, 2026 14:40

DOC Add a note about running pytest in the README

da9afaa

Preload SentenceTransformer model at Django startup before traffic is…

5ce7782

… routed to the application instance

sahilds1 changed the title ~~[WIP] [#441] Embedding Models~~ [#441] Preload embedding model at startup Feb 27, 2026

sahilds1 requested review from amahuli03 and taichan03 March 10, 2026 19:34

sahilds1 marked this pull request as ready for review March 10, 2026 19:37

sahilds1 added 2 commits March 11, 2026 13:06

Merge branch 'develop' into 441-embedding-models

50a8bd3

Run python-app workflow on pushes and PRs to develop branch

795f218

sahilds1 requested a review from Copilot March 11, 2026 17:51

Copilot started reviewing on behalf of sahilds1 March 11, 2026 17:52

Copilot AI reviewed Mar 11, 2026

amahuli03 reviewed Mar 13, 2026

sahilds1 added 8 commits March 19, 2026 13:15

Pytest won’t automatically discover config files in subdirectories

d498a00

Merge branch 'develop' into 441-embedding-models

6d3d8d1

Suppress E402 import violations

3824d81

Add build_query tests and document coverage gaps in embedding_services

46e9969

Fill test gaps in test_embedding_services

64a19ef

Fix incorrect build_query test assertions

dec3c12

Guard TransformerModel preload to runserver processes only

f9e890a

Revert GitHub Workflow changes

67176a8

sahilds1 changed the title ~~[#441] Preload embedding model at startup~~ [#441] Preload embedding model at startup for search and file upload Mar 25, 2026

sahilds1 mentioned this pull request Mar 26, 2026

[IMPROVE] Embedding model encoding time for search and file upload #441

Open

1 task

sahilds1 added 2 commits March 26, 2026 14:35

Add section header comments to all four test groups in test_embedding…

d273921

…_services.py

Document why tests are split by responsibility

8198574

sahilds1 changed the title ~~[#441] Preload embedding model at startup for search and file upload~~ [#441] [IMPROVE] Preload embedding model at startup for search and file upload Mar 26, 2026

sahilds1 changed the title ~~[#441] [IMPROVE] Preload embedding model at startup for search and file upload~~ [#441] [IMPROVE] Preload embedding model at startup Mar 26, 2026

sahilds1 requested a review from Copilot March 27, 2026 17:08

Copilot started reviewing on behalf of sahilds1 March 27, 2026 17:08

Copilot AI reviewed Mar 27, 2026
